Search - java html


[SourceCode] html.rar

Description: A browser program written in Java
Platform: | Size: 1372 | Author: | Hits:

[Internet-Network] Writing an HTML File Parser in Java

Description:

Writing an HTML File Parser in Java

 I. Overview

    

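    Correctly analyzing the tags in an HTML file is central to Web software, just as the interpreter of a programming language analyzes the reserved words in a source file before interpreting it. In practice we often need keyword analysis of one particular file type. For example, we may want to download an HTML file together with the .gif, .class, and other files it refers to, which means separating out the tags in the HTML file and finding the file names and directories they contain. Before Java, this kind of work meant examining every character in the file to pick out the parts of interest, which took a lot of code and was error-prone. In a recent project the author used Java's input stream class StreamTokenizer to analyze HTML files, with good results. The goal here is to download an HTML file from a known Web page, analyze it, and then download the HTML files (if the page uses frames), image files, and class (Java applet) files that the page includes.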

    

    II. StreamTokenizer

    

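    StreamTokenizer, a tokenizing input stream, turns an input stream into a stream of tokens. The tokens fall into three kinds: words (multi-character tokens), single-character tokens, and whitespace (which includes Java and C/C++ comments).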

    

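    The constructor of the StreamTokenizer class is: StreamTokenizer(InputStream in). This form has been deprecated since JDK 1.1 in favor of StreamTokenizer(Reader r), which is the one the class in section III calls through super(r).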

    

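    The class has several public instance variables: ttype, sval, and nval, which hold the token type, the current string value, and the current numeric value. To get the characters between tokens (that is, between HTML tags), read sval. To advance to the next token, call nextToken(). It returns an int: for a single-character token this is the character itself, and otherwise it is one of four predefined constants: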

    

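    StreamTokenizer.TT_NUMBER: the token is a number. Its value is a double and can be read from the instance variable nval.

    

    StreamTokenizer.TT_WORD: the token is a non-numeric word (other characters may be part of it). The word can be read from the instance variable sval.

    

    StreamTokenizer.TT_EOL: the token is an end-of-line marker.

    

    StreamTokenizer.TT_EOF: nextToken() returns this value once the end of the stream has been reached.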
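
    The return values listed above can be seen in a short loop. The following is a minimal sketch, not code from the article: the class name TokenTypesDemo, the method describe, and the sample input are made up for illustration. It uses the default syntax table and switches on the value returned by nextToken().

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class and method names are not from the article.
final class TokenTypesDemo {

    // Returns one description per token produced by the default syntax table.
    static List<String> describe(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.eolIsSignificant(true); // otherwise line ends are skipped as whitespace
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                switch (type) {
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER " + st.nval); // numeric value, a double
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD " + st.sval);   // string value
                        break;
                    case StreamTokenizer.TT_EOL:
                        out.add("EOL");
                        break;
                    default:
                        // a single-character token: the return value is the character
                        out.add("CHAR " + (char) type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        describe("width 42\nsrc = logo").forEach(System.out::println);
    }
}
```

    Note that the default table treats '/' as a comment character and '"' as a quote character, which is unhelpful for HTML. This is one reason to reset the syntax table before tokenizing a page.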

    

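    Before the first call to nextToken(), set up the syntax table of the input stream so that the tokenizer can tell the different kinds of characters apart. The method whitespaceChars(int low, int hi) defines a range of characters that carry no meaning. The method wordChars(int low, int hi) defines a range of characters that make up words.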
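
    As a rough illustration of a custom syntax table, the sketch below splits an attribute list such as SRC=logo.gif WIDTH=42 into names, '=' signs, and values. The class name SyntaxTableDemo, the method tokens, and the chosen character ranges are assumptions made for this example and do not come from the article.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and character ranges are assumptions for this sketch.
final class SyntaxTableDemo {

    static List<String> tokens(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.resetSyntax();              // every character becomes "ordinary"
        st.wordChars('a', 'z');        // letters, digits and '.' build words
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');        // digits stay text: number parsing is off after resetSyntax()
        st.wordChars('.', '.');
        st.whitespaceChars(0, ' ');    // control characters and blanks separate tokens
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else {
                    out.add(String.valueOf((char) type)); // ordinary character such as '='
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("SRC=logo.gif WIDTH=42"));
    }
}
```

    Because '=' is never registered as a word or whitespace character, it stays ordinary and comes back from nextToken() as its own character value.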

    

    III. Implementation

    

    1. Implementing the HtmlTokenizer class

    

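    Before a token stream can be analyzed, its syntax table must be set up. In this example that means letting the program work out which words are HTML tags. The token stream class below handles the HTML tags we need. It is a subclass of StreamTokenizer: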

    

    

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.StreamTokenizer;

    class HtmlTokenizer extends StreamTokenizer {
        // Tag codes. Only the tags this example needs are defined;
        // extend the list as required.
        static final int HTML_TEXT = -1;
        static final int HTML_UNKNOWN = -2;
        static final int HTML_EOF = -3;
        static final int HTML_IMAGE = -4;
        static final int HTML_FRAME = -5;
        static final int HTML_BACKGROUND = -6;
        static final int HTML_APPLET = -7;

        boolean outsideTag = true; // true while the reader is outside a tag

        // Constructor: defines the syntax table of this token stream.
        public HtmlTokenizer(BufferedReader r) {
            super(r);
            this.resetSyntax();      // reset the syntax table
            this.wordChars(0, 255);  // every character may be part of a token
            this.ordinaryChar('<');  // delimiters on either side of an HTML tag
            this.ordinaryChar('>');
        } // end of constructor

        public int nextHtml() {
            int token; // the current token
            try {
                switch (token = this.nextToken()) {
                case StreamTokenizer.TT_EOF:
                    // end of the stream reached
                    return HTML_EOF;
                case '<': // entering a tag
                    outsideTag = false;
                    return nextHtml();
                case '>': // leaving a tag
                    outsideTag = true;
                    return nextHtml();
                case StreamTokenizer.TT_WORD:
                    // the token is a word: decide which tag it is
                    if (allWhite(sval))
                        return nextHtml(); // skip whitespace-only tokens
                    else if (sval.toUpperCase().indexOf("FRAME") != -1 && !outsideTag)
                        return HTML_FRAME; // FRAME tag
                    else if (sval.toUpperCase().indexOf("IMG") != -1 && !outsideTag)
                        return HTML_IMAGE; // IMG tag
                    else if (sval.toUpperCase().indexOf("BACKGROUND") != -1 && !outsideTag)
                        return HTML_BACKGROUND; // BACKGROUND attribute
                    else if (sval.toUpperCase().indexOf("APPLET") != -1 && !outsideTag)
                        return HTML_APPLET; // APPLET tag
                    // any other word falls through to default, as in the original
                default:
                    System.out.println("Unknown tag: " + token);
                    return HTML_UNKNOWN;
                } // end of switch
            } catch (IOException e) {
                System.out.println("Error:" + e.getMessage());
            }
            return HTML_UNKNOWN;
        } // end of nextHtml

        // Returns true if s contains only whitespace. The original omits the
        // body; this one-line version is an assumed implementation.
        protected boolean allWhite(String s) {
            return s.trim().isEmpty();
        } // end of allWhite

    } // end of class

    

    The approach above was tested successfully in a recent project. The operating system was Windows NT4 and the development tool was Inprise JBuilder 3.
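
    Once nextHtml() reports HTML_IMAGE, HTML_FRAME, HTML_BACKGROUND, or HTML_APPLET, the field sval still holds the whole text of the tag, and the file name has to be pulled out of it. The article does not show that step. One possible way, sketched below with a regular expression, is a hypothetical helper named attributeValue, which is an assumption of this example and not part of the original class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original article.
final class TagAttributes {

    // Returns the value of attribute `name` inside the text of one tag
    // (for example the sval of HtmlTokenizer), or null if it is absent.
    static String attributeValue(String tagText, String name) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(name)
                        + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(tagText);
        if (!m.find()) {
            return null;
        }
        // exactly one of the three groups matched: "..." or '...' or a bare value
        for (int g = 1; g <= 3; g++) {
            if (m.group(g) != null) {
                return m.group(g);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(attributeValue("IMG SRC=\"images/logo.gif\" ALT=\"Logo\"", "src"));
        System.out.println(attributeValue("BODY BACKGROUND=bg.gif", "background"));
    }
}
```

    A caller would typically ask for SRC after HTML_IMAGE or HTML_FRAME, BACKGROUND after HTML_BACKGROUND, and CODE after HTML_APPLET, then resolve the result against the URL of the page.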


Platform: | Size: 1066 | Author: tiberxu | Hits:

[JSP/Java] java2html

Description: This program converts Java source code into an HTML file, marking the syntax in different colors. The original program had no file-output class; the uploader added one.
Platform: | Size: 3236 | Author: 中国 | Hits:

[Crack Hack] JavaFR_RSA_Source

Description: A complete Java implementation of the RSA algorithm. /** * <p>Title: RSA </p> * <p>Description: Data encoding according to the RSA protocol </p> * <p>Copyright: Copyright (c) 2004</p> * @author François Bradette * @version 1.1 * original version by Robert Sedgewick and Kevin Wayne. Copyright © 2004 * taken from the site http://www.cs.princeton.edu/introcs/104crypto/RSA.java.html * Modified by François Bradette */
Platform: | Size: 15588 | Author: 某男 | Hits:

[Other resource] 《Java教程宝典》 (Java Tutorial Treasury)

Description: Java is the most popular programming language on the Internet. Focusing on Java's networking features, this document gives a first introduction to fetching images, audio, HTML documents, and text files from the network in Java, and also covers how to obtain network resources dynamically. It includes many short, easy-to-follow examples.
Platform: | Size: 798531 | Author: 陈晨 | Hits:

[Other] Java 3D text fade-in effect

Description: * Displays a string with a three-dimensional effect; the string is passed in from HTML. * The 3D text fades in gradually, using color changes, displacement, and slow rendering to produce the effect. * The fade-in is controlled by a thread.
Platform: | Size: 3492 | Author: 邰科 | Hits:

[Windows Develop] Java development (Samsung)_Download Center_NetEase

Description: JPad Pro is a complete development environment for Java applications and Java applets, and it also supports HTML and other file types.
Platform: | Size: 8687 | Author: hf | Hits:

[Web Server] Implementing a Web server in Java

Description: Implementing a Web server in Java. This article presents a Web server program that handles GET requests. It creates a ServerSocket object that listens on port 8080, waits for and accepts a client connection on that port, and creates the input and output streams associated with the socket. It then reads the client's request. If the request type is GET, it extracts the name of the requested HTML file from the request. If the file exists, the server opens it, sends the HTTP headers and the file contents back to the Web browser through the socket, and closes the file. Otherwise it sends an error message to the Web browser. Finally, it closes the socket connected to that browser.
Platform: | Size: 10425 | Author: 雨岳 | Hits:
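
The steps in this description can be sketched roughly as follows. This is not the code from the archive, which is not shown here. The class name TinyWebServer, the methods handle and send, the HTTP/1.0 status lines, and serving files from the current directory are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only; not the program described in the listing.
final class TinyWebServer {

    // Handles one client: reads the request and answers a GET for a file under root.
    static void handle(Socket client, Path root) throws IOException {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.ISO_8859_1));
             OutputStream out = s.getOutputStream()) {
            String requestLine = in.readLine();   // e.g. "GET /index.html HTTP/1.0"
            String header = requestLine;
            while (header != null && !header.isEmpty()) {
                header = in.readLine();           // skip the remaining request headers
            }
            String[] parts = requestLine == null ? new String[0] : requestLine.split(" ");
            if (parts.length < 2 || !parts[0].equals("GET") || !parts[1].startsWith("/")) {
                send(out, "501 Not Implemented", "Only GET is supported".getBytes(StandardCharsets.UTF_8));
                return;
            }
            Path file = root.resolve(parts[1].substring(1)).normalize();
            if (file.startsWith(root) && Files.isRegularFile(file)) {
                send(out, "200 OK", Files.readAllBytes(file));   // headers, then file contents
            } else {
                send(out, "404 Not Found", "File not found".getBytes(StandardCharsets.UTF_8));
            }
        } // closing the socket ends the exchange
    }

    static void send(OutputStream out, String status, byte[] body) throws IOException {
        String head = "HTTP/1.0 " + status + "\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + body.length + "\r\n\r\n";
        out.write(head.getBytes(StandardCharsets.ISO_8859_1));
        out.write(body);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(".").toAbsolutePath().normalize();
        try (ServerSocket server = new ServerSocket(8080)) {   // listen on port 8080
            while (true) {
                handle(server.accept(), root);   // one client at a time keeps the sketch short
            }
        }
    }
}
```

Handling one connection at a time is a deliberate simplification. A server meant for real use would hand each accepted socket to its own thread or to a thread pool.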

[WEB Code] HTML template

Description: An HTML template for a user home page. It can serve as the front page of your own blog and can also help beginners.
Platform: | Size: 1704960 | Author: hadleyV5 | Hits:

[JSP/Java] 6Nisan

Description: Java application for learning
Platform: | Size: 285696 | Author: duygucet | Hits:

[JSP/Java] JavaWeb video tutorial_day1 - materials and source code

Description: Chapter 1 of the JavaWeb series, which mainly introduces HTML.
Platform: | Size: 16446464 | Author: hdad | Hits:

[Other] Java tools

Description: A reference lookup tool for Java that bundles various documentation sets, such as HTML, JDK, and jQuery. Suitable for junior programmers.
Platform: | Size: 48055296 | Author: 七01 | Hits:

[Other] pd4ml.jar

Description: Many articles on exporting PDF from HTML in Java rely on iText. Anyone who has used iText knows that it sometimes falls short: it is not compatible with HTML styles, and images exported from an HTML page into a PDF are hard to handle. Flying Saucer implements html2pdf with poor error tolerance. It supports several Chinese fonts (some styles are not recognized) and is very strict about HTML formatting. Flying Saucer is a good choice when working from a single template, but it is poorly suited to exporting irregular HTML to PDF. In that case other technologies are worth considering, and PD4ML meets the need: it implements html2pdf quickly, tolerates errors well enough to filter out irregular HTML tags, supports several Chinese fonts, and supports CSS.
Platform: | Size: 405504 | Author: 瞳骸 | Hits:

[JSP/Java] Word to HTML conversion

Description: A Java program that converts Word documents to HTML, built with the open-source Apache POI project.
Platform: | Size: 18198528 | Author: 如斯。 | Hits:

[WEB Code] HTML

Description: A university student's Java programming assignment in Web front-end development: a Web user-registration page with selectable information fields.
Platform: | Size: 2048 | Author: 呼哈呼哈1 | Hits:

[JSP/Java] Requirements analysis for a Java graduation-project management system

Description: Technical feasibility: the system uses JSP technology. JavaServer Pages (JSP) is a server-side scripting environment for creating and running dynamic, interactive Web server applications. With JSP, HTML pages, script commands, and components can be combined to build interactive Web pages and powerful Web-based applications. JSP applications are easy to develop and maintain. Economic feasibility: the system itself is not complicated, and with JSP it does not call for much manpower or material, so the development budget stays small and the project is entirely feasible economically. Operational feasibility: with the campus network built and growing, this is a good opportunity for the system to prove itself, and it runs on the campus intranet.
Platform: | Size: 70656 | Author: 李明666 | Hits:

[Other] 20160427 - Development Dept. 3 HTML+CSS training materials

Description: Industry, Java, LabVIEW, MyBatis, SpringMVC. With the rapid development of China's automobile industry and the continued growth of vehicle production, powertrain development is the core of the automotive R&D system, and the large volume of data generated by the many repeated tests it requires is the core of that core. Engineers use these data to optimize and design powertrains. At present, however, the test data generated by most domestic engine R&D centers is scattered across individual industrial control computers and stored in a mix of structured and unstructured formats.
Platform: | Size: 49694720 | Author: zj132136 | Hits:

[JSP/Java] Java object-oriented programming

Description: Java learning courseware, with source code and slides. (Java: http://www.pudn.com/Download/item/id/757512.html)
Platform: | Size: 2733056 | Author: 向前看123 | Hits:

[JSP/Java] contoh

Description: Building a mobile application for Android using HTML
Platform: | Size: 1508352 | Author: saddamnur | Hits:

[WEB Code] rs

Description: A personnel management system, written as a Java Web application in IDEA. Its functionality is not yet complete.
Platform: | Size: 3846144 | Author: Aimitee | Hits:

CodeBus www.codebus.net